AITopics | open world

Collaborating Authors

open world

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning to Augment Distributions for Out-of-distribution Detection

Neural Information Processing SystemsDec-27-2025, 02:29:33 GMT

Open-world classification systems should discern out-of-distribution (OOD) data whose labels deviate from those of in-distribution (ID) cases, motivating recent studies in OOD detection. Advanced works, despite their promising progress, may still fail in the open world, owing to the lacking knowledge about unseen OOD data in advance. Although one can access auxiliary OOD data (distinct from unseen ones) for model training, it remains to analyze how such auxiliary data will work in the open world. To this end, we delve into such a problem from a learning theory perspective, finding that the distribution discrepancy between the auxiliary and the unseen real OOD data is the key to affect the open-world detection performance. Accordingly, we propose Distributional-Augmented OOD Learning (DAOL), alleviating the OOD distribution discrepancy by crafting an OOD distribution set that contains all distributions in a Wasserstein ball centered on the auxiliary OOD distribution. We justify that the predictor trained over the worst OOD data in the ball can shrink the OOD distribution discrepancy, thus improving the open-world detection performance given only the auxiliary OOD data. We conduct extensive evaluations across representative OOD detection setups, demonstrating the superiority of our DAOL over its advanced counterparts.

augment distribution, learning, ood data, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.77)

Add feedback

Requirements for Recognition and Rapid Response to Unfamiliar Events Outside of Agent Design Scope

Wray, Robert E., Jones, Steven J., Laird, John E.

arXiv.org Artificial IntelligenceJun-17-2025

Regardless of past learning, an agent in an open world will face unfamiliar events outside of prior experience, existing models, or policies. Further, the agent will sometimes lack relevant knowledge and/or sufficient time to assess the situation and evaluate response options. How can an agent respond reasonably to situations that are outside of its original design scope? How can it recognize such situations sufficiently quickly and reliably to determine reasonable, adaptive courses of action? We identify key characteristics needed for solutions, review the state-of-the-art, and outline a proposed, novel approach that combines domain-general meta-knowledge (inspired by human cognition) and metareason-ing. This approach offers potential for fast, adaptive responses to unfamiliar situations, more fully meeting the performance characteristics required for open-world, general agents.

agent, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2504.12497

Country: North America > United States (1.00)

Genre: Research Report (0.84)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)

Add feedback

MindsEye review – a dystopian future that plays like it's from 2012

The GuardianJun-13-2025, 12:25:36 GMT

It's pretty much a straight copy of the original: a huge soap bubble, half sunk into the desert floor, with its surface turned into a gigantic TV. Occasionally you'll pull up near the Sphere while driving an electric vehicle made by Silva, the megacorp that controls this world. You'll sometimes come to a stop just as an advert for an identical Silva EV plays out on the huge curved screen overhead. The doubling effect can be slightly vertigo-inducing. At these moments, I truly get what MindsEye is trying to do.

artificial intelligence, dystopian future, open world, (4 more...)

The Guardian

Country: North America > United States > Nevada > Clark County > Las Vegas (0.05)

Industry:

Transportation > Ground > Road (0.71)
Transportation > Electric Vehicle (0.56)

Technology: Information Technology > Artificial Intelligence (0.31)

Add feedback

OpenSDI: Spotting Diffusion-Generated Images in the Open World

Wang, Yabin, Huang, Zhiwu, Hong, Xiaopeng

arXiv.org Artificial IntelligenceApr-17-2025

This paper identifies OpenSDI, a challenge for spotting diffusion-generated images in open-world settings. In response to this challenge, we define a new benchmark, the OpenSDI dataset (OpenSDID), which stands out from existing datasets due to its diverse use of large vision-language models that simulate open-world diffusion-based manipulations. Another outstanding feature of OpenSDID is its inclusion of both detection and localization tasks for images manipulated globally and locally by diffusion models. To address the OpenSDI challenge, we propose a Synergizing Pretrained Models (SPM) scheme to build up a mixture of foundation models. This approach exploits a collaboration mechanism with multiple pretrained foundation models to enhance generalization in the OpenSDI context, moving beyond traditional training by synergizing multiple pretrained models through prompting and attending strategies. Building on this scheme, we introduce MaskCLIP, an SPM-based model that aligns Contrastive Language-Image Pre-Training (CLIP) with Masked Autoencoder (MAE). Extensive evaluations on OpenSDID show that MaskCLIP significantly outperforms current state-of-the-art methods for the OpenSDI challenge, achieving remarkable relative improvements of 14.23% in IoU (14.11% in F1) and 2.05% in accuracy (2.38% in F1) compared to the second-best model in localization and detection tasks, respectively. Our dataset and code are available at https://github.com/iamwangyabin/OpenSDI.

artificial intelligence, machine learning, spotting diffusion-generated image, (2 more...)

arXiv.org Artificial Intelligence

2503.19653

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.87)

Add feedback

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Tang, Luyao, Yuan, Yuxuan, Chen, Chaoqi, Zhang, Zeyu, Huang, Yue, Zhang, Kun

arXiv.org Artificial IntelligenceMar-24-2025

Although foundation models (FMs) claim to be powerful, their generalization ability significantly decreases when faced with distribution shifts, weak supervision, or malicious attacks in the open world. On the other hand, most domain generalization or adversarial fine-tuning methods are task-related or model-specific, ignoring the universality in practical applications and the transferability between FMs. This paper delves into the problem of generalizing FMs to the out-of-domain data. We propose a novel framework, the Object-Concept-Relation Triad (OCRT), that enables FMs to extract sparse, high-level concepts and intricate relational structures from raw visual inputs. The key idea is to bind objects in visual scenes and a set of object-centric representations through unsupervised decoupling and iterative refinement. To be specific, we project the object-centric representations onto a semantic concept space that the model can readily interpret and estimate their importance to filter out irrelevant elements. Then, a concept-based graph, which has a flexible degree, is constructed to incorporate the set of concepts and their corresponding importance, enabling the extraction of high-order factors from informative concepts and facilitating relational reasoning among these concepts. Extensive experiments demonstrate that OCRT can substantially boost the generalizability and robustness of SAM and CLIP across multiple downstream tasks.

large language model, machine learning, ocrt, (18 more...)

arXiv.org Artificial Intelligence

2503.18695

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Fujian Province > Xiamen (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(3 more...)

Genre: Research Report (0.90)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Learning to Augment Distributions for Out-of-distribution Detection

Neural Information Processing SystemsJan-20-2025, 01:07:19 GMT

distribution discrepancy, ood data, out-of-distribution detection, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Add feedback

From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Li, Zizhao, Xiang, Zhengkang, West, Joseph, Khoshelham, Kourosh

arXiv.org Artificial IntelligenceDec-1-2024

Traditional object detection methods operate under the closed-set assumption, where models can only detect a fixed number of objects predefined in the training set. Recent works on open vocabulary object detection (OVD) enable the detection of objects defined by an unbounded vocabulary, which reduces the cost of training models for specific tasks. However, OVD heavily relies on accurate prompts provided by an ''oracle'', which limits their use in critical applications such as driving scene perception. OVD models tend to misclassify near-out-of-distribution (NOOD) objects that have similar semantics to known classes, and ignore far-out-of-distribution (FOOD) objects. To address theses limitations, we propose a framework that enables OVD models to operate in open world settings, by identifying and incrementally learning novel objects. To detect FOOD objects, we propose Open World Embedding Learning (OWEL) and introduce the concept of Pseudo Unknown Embedding which infers the location of unknown classes in a continuous semantic space based on the information of known classes. We also propose Multi-Scale Contrastive Anchor Learning (MSCAL), which enables the identification of misclassified unknown objects by promoting the intra-class consistency of object embeddings at different scales. The proposed method achieves state-of-the-art performance in common open world object detection and autonomous driving benchmarks.

data mining, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.18207

Country:

North America > United States (0.04)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology (0.48)
Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Open-World Visual Reasoning by a Neuro-Symbolic Program of Zero-Shot Symbols

Burghouts, Gertjan, Hillerström, Fieke, Walraven, Erwin, van Bekkum, Michael, Ruis, Frank, Sijs, Joris, van Mil, Jelle, Dijk, Judith

arXiv.org Artificial IntelligenceJul-18-2024

We consider the problem of finding spatial configurations of multiple objects in images, e.g., a mobile inspection robot is tasked to localize abandoned tools on the floor. We define the spatial configuration of objects by first-order logic in terms of relations and attributes. A neuro-symbolic program matches the logic formulas to probabilistic object proposals for the given image, provided by language-vision models by querying them for the symbols. This work is the first to combine neuro-symbolic programming (reasoning) and language-vision models (learning) to find spatial configurations of objects in images in an open world setting. We show the effectiveness by finding abandoned tools on floors and leaking pipes. We find that most prediction errors are due to biases in the language-vision model.

configuration, language-vision model, spatial configuration, (16 more...)

arXiv.org Artificial Intelligence

2407.13382

Country: Europe > Netherlands > South Holland > The Hague (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.35)

Add feedback

BOWLL: A Deceptively Simple Open World Lifelong Learner

Kamath, Roshni, Mitchell, Rupert, Paul, Subarnaduti, Kersting, Kristian, Mundt, Martin

arXiv.org Artificial IntelligenceFeb-7-2024

The quest to improve scalar performance numbers on predetermined benchmarks seems to be deeply engraved in deep learning. However, the real world is seldom carefully curated and applications are seldom limited to excelling on test sets. A practical system is generally required to recognize novel concepts, refrain from actively including uninformative data, and retain previously acquired knowledge throughout its lifetime. Despite these key elements being rigorously researched individually, the study of their conjunction, open world lifelong learning, is only a recent trend. To accelerate this multifaceted field's exploration, we introduce its first monolithic and much-needed baseline. Leveraging the ubiquitous use of batch normalization across deep neural networks, we propose a deceptively simple yet highly effective way to repurpose standard models for open world lifelong learning. Through extensive empirical evaluation, we highlight why our approach should serve as a future standard for models that are able to effectively maintain their knowledge, selectively focus on informative data, and accelerate future learning.

bowll, learning, memory buffer, (13 more...)

arXiv.org Artificial Intelligence

2402.04814

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (0.92)

Industry: Education > Educational Setting (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Open World Object Detection in the Era of Foundation Models

Zohar, Orr, Lozano, Alejandro, Goel, Shelly, Yeung, Serena, Wang, Kuan-Chieh

arXiv.org Artificial IntelligenceDec-9-2023

Object detection is integral to a bevy of real-world applications, from robotics to medical image analysis. To be used reliably in such applications, models must be capable of handling unexpected - or novel - objects. The open world object detection (OWD) paradigm addresses this challenge by enabling models to detect unknown objects and learn discovered ones incrementally. However, OWD method development is hindered due to the stringent benchmark and task definitions. These definitions effectively prohibit foundation models. Here, we aim to relax these definitions and investigate the utilization of pre-trained foundation models in OWD. First, we show that existing benchmarks are insufficient in evaluating methods that utilize foundation models, as even naive integration methods nearly saturate these benchmarks. This result motivated us to curate a new and challenging benchmark for these models. Therefore, we introduce a new benchmark that includes five real-world application-driven datasets, including challenging domains such as aerial and surgical images, and establish baselines. We exploit the inherent connection between classes in application-driven datasets and introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects. FOMO has ~3x unknown object mAP compared to baselines on our benchmark. However, our results indicate a significant place for improvement - suggesting a great research opportunity in further scaling object detection methods to real-world domains. Our code and benchmark are available at https://orrzohar.github.io/projects/fomo/.

benchmark, dataset, detection, (16 more...)

arXiv.org Artificial Intelligence

2312.05745

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Floriana (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.48)
Leisure & Entertainment > Sports (0.46)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback